
The aggregation approach implemented in TrajectoryCollectionAggregator is based on Andrienko, N., & Andrienko, G. (2011). Spatial generalization and aggregation of massive movement data. IEEE Transactions on Visualization and Computer Graphics, 17(2), 205-219. It consists of three main steps: extracting significant points from the trajectories, clustering those points, and extracting flows between the clusters.
import pandas as pd
import geopandas as gpd
from geopandas import GeoDataFrame, read_file
from shapely.geometry import Point, LineString, Polygon
from datetime import datetime, timedelta
from holoviews import opts, dim
import movingpandas as mpd
import warnings
warnings.filterwarnings('ignore')
mpd.show_versions()
MovingPandas 0.9.rc2

SYSTEM INFO
-----------
python     : 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 05:37:49) [MSC v.1916 64 bit (AMD64)]
executable : E:\Anaconda\envs\mpd-ex\python.exe
machine    : Windows-10-10.0.19041-SP0

GEOS, GDAL, PROJ INFO
---------------------
GEOS       : None
GEOS lib   : None
GDAL       : 3.2.1
GDAL data dir: None
PROJ       : 7.2.0
PROJ data dir: E:\Anaconda\envs\mpd-ex\Library\share\proj

PYTHON DEPENDENCIES
-------------------
geopandas  : 0.10.2
pandas     : 1.3.5
fiona      : 1.8.18
numpy      : 1.21.5
shapely    : 1.7.1
rtree      : 0.9.7
pyproj     : 3.1.0
matplotlib : 3.5.1
mapclassify: 2.4.3
geopy      : 2.2.0
holoviews  : 1.14.6
hvplot     : 0.7.3
geoviews   : 1.9.2
FSIZE = 350
gdf = read_file('../data/geolife_small.gpkg')
traj_collection = mpd.TrajectoryCollection(gdf, 'trajectory_id', t='t')
traj_collection.hvplot(line_width=7.0, tiles='StamenTonerBackground', width=FSIZE, height=FSIZE)
help(mpd.TrajectoryCollectionAggregator)
Help on class TrajectoryCollectionAggregator in module movingpandas.trajectory_aggregator:

class TrajectoryCollectionAggregator(builtins.object)
 |  TrajectoryCollectionAggregator(traj_collection, max_distance, min_distance, min_stop_duration, min_angle=45)
 |
 |  Methods defined here:
 |
 |  __init__(self, traj_collection, max_distance, min_distance, min_stop_duration, min_angle=45)
 |      Aggregates trajectories by extracting significant points,
 |      clustering those points, and extracting flows between clusters.
 |
 |      Parameters
 |      ----------
 |      traj_collection : TrajectoryCollection
 |          TrajectoryCollection to be aggregated
 |      max_distance : float
 |          Maximum distance between significant points (distance is
 |          calculated in CRS units, except if the CRS is geographic, e.g.
 |          EPSG:4326 WGS84, then distance is calculated in meters)
 |      min_distance : float
 |          Minimum distance between significant points
 |      min_stop_duration : integer
 |          Minimum duration required for stop detection (in seconds)
 |      min_angle : float
 |          Minimum angle for significant point extraction
 |
 |      References
 |      ----------
 |      * Andrienko, N., & Andrienko, G. (2011). Spatial generalization and
 |        aggregation of massive movement data. IEEE Transactions on
 |        visualization and computer graphics, 17(2), 205-219.
 |
 |  get_clusters_gdf(self)
 |      Return the extracted cluster centroids
 |
 |      Returns
 |      -------
 |      GeoDataFrame
 |          Cluster centroids, incl. the number of clustered significant
 |          points (n).
 |
 |  get_flows_gdf(self)
 |      Return the extracted flows
 |
 |      Returns
 |      -------
 |      GeoDataFrame
 |          Flow lines, incl. the number of trajectories summarized in the
 |          flow (weight).
 |
 |  get_significant_points_gdf(self)
 |      Return the extracted significant points
 |
 |      Returns
 |      -------
 |      GeoDataFrame
 |          Significant points
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
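To give a feeling for what the min_angle parameter controls, here is a minimal pure-Python sketch that flags points where the direction of movement changes by at least a given angle. It is an illustration only and not the MovingPandas implementation: the helper names turn_angle and turning_points are hypothetical, coordinates are assumed to be planar, and the real extraction also takes min_distance, max_distance and min_stop_duration into account.

```python
from math import atan2, degrees


def turn_angle(p0, p1, p2):
    """Absolute change of heading (0..180 degrees) at p1 on the path p0 -> p1 -> p2."""
    heading_in = degrees(atan2(p1[1] - p0[1], p1[0] - p0[0]))
    heading_out = degrees(atan2(p2[1] - p1[1], p2[0] - p1[0]))
    diff = abs(heading_out - heading_in) % 360
    return min(diff, 360 - diff)


def turning_points(points, min_angle=45):
    """Return the interior points where the heading changes by at least min_angle."""
    triples = zip(points, points[1:], points[2:])
    return [p1 for p0, p1, p2 in triples if turn_angle(p0, p1, p2) >= min_angle]


# Hypothetical planar track: nearly straight, then two right-angle turns
path = [(0, 0), (100, 0), (200, 5), (200, 100), (300, 100)]
significant = turning_points(path, min_angle=45)
print(significant)
```

In this sketch the slight drift at (100, 0) stays below the threshold, while the two sharp turns are kept.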
Generalizing the trip trajectories significantly speeds up the following aggregation step.
generalized = mpd.MinDistanceGeneralizer(traj_collection).generalize(tolerance=100)
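The idea behind a minimum-distance generalization can be sketched in a few lines of plain Python: a point is kept only if it lies at least tolerance away from the last kept point. This is a simplified illustration under the assumption of planar coordinates, not the code behind MinDistanceGeneralizer; the function name min_distance_generalize is hypothetical.

```python
from math import hypot


def min_distance_generalize(points, tolerance):
    """Keep a point only if it is at least `tolerance` away from the last kept point."""
    if not points:
        return []
    kept = [points[0]]  # the first point is always kept
    for x, y in points[1:]:
        last_x, last_y = kept[-1]
        if hypot(x - last_x, y - last_y) >= tolerance:
            kept.append((x, y))
    return kept


# Hypothetical track in planar units (e.g. meters)
track = [(0, 0), (30, 0), (60, 0), (120, 0), (150, 0), (260, 0)]
generalized_track = min_distance_generalize(track, tolerance=100)
print(generalized_track)
```

Fewer points per trajectory means fewer candidates to examine in the subsequent aggregation, which is where the speed-up comes from.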
aggregator = mpd.TrajectoryCollectionAggregator(generalized, max_distance=1000, min_distance=100, min_stop_duration=timedelta(minutes=5))
pts = aggregator.get_significant_points_gdf()
clusters = aggregator.get_clusters_gdf()
( pts.hvplot(geo=True, tiles='StamenTonerBackground', width=FSIZE, height=FSIZE) *
clusters.hvplot(geo=True, color='red' ) )
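The clustering step groups significant points by spatial proximity and reports one centroid per group together with the number of points n. The sketch below shows one simple greedy way to do this in plain Python. It is an assumption-laden illustration, not the algorithm used inside MovingPandas: the function cluster_points is hypothetical, coordinates are planar, and max_distance is treated here as a cluster radius.

```python
from math import hypot


def cluster_points(points, max_distance):
    """Greedily assign each point to the nearest centroid within max_distance.

    Returns a list of (centroid_x, centroid_y, n) tuples.
    """
    clusters = []  # running sums per cluster, so centroids update as points arrive
    for x, y in points:
        best, best_dist = None, None
        for cluster in clusters:
            cx = cluster["sum_x"] / cluster["n"]
            cy = cluster["sum_y"] / cluster["n"]
            dist = hypot(x - cx, y - cy)
            if dist <= max_distance and (best is None or dist < best_dist):
                best, best_dist = cluster, dist
        if best is None:
            # no centroid is close enough: start a new cluster
            clusters.append({"sum_x": x, "sum_y": y, "n": 1})
        else:
            best["sum_x"] += x
            best["sum_y"] += y
            best["n"] += 1
    return [(c["sum_x"] / c["n"], c["sum_y"] / c["n"], c["n"]) for c in clusters]


# Hypothetical significant points forming two groups
sample_points = [(0, 0), (10, 0), (1000, 0), (1010, 0), (20, 0)]
centroids = cluster_points(sample_points, max_distance=100)
print(centroids)
```

Note that a greedy scheme like this depends on the order in which points are processed.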
flows = aggregator.get_flows_gdf()
( flows.hvplot(geo=True, hover_cols=['weight'], line_width=dim('weight')*7, color='#1f77b3', tiles='StamenTonerBackground') *
clusters.hvplot(geo=True, color='red', size=dim('n') ) )
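The weight of a flow is the number of trajectories summarized in it. A minimal way to picture how such counts can arise: assign every trajectory point to its nearest cluster centroid and count the transitions between different clusters. The following plain-Python sketch does exactly that; the helpers nearest_cluster and count_flows and the sample data are hypothetical, and this is not a reproduction of how get_flows_gdf computes its result.

```python
from collections import Counter
from math import hypot


def nearest_cluster(point, centroids):
    """Index of the centroid closest to the given point."""
    x, y = point
    return min(
        range(len(centroids)),
        key=lambda i: hypot(x - centroids[i][0], y - centroids[i][1]),
    )


def count_flows(trajectories, centroids):
    """Count transitions between different clusters over all trajectories."""
    flows = Counter()
    for trajectory in trajectories:
        cluster_ids = [nearest_cluster(p, centroids) for p in trajectory]
        for origin, destination in zip(cluster_ids, cluster_ids[1:]):
            if origin != destination:  # staying inside a cluster is not a flow
                flows[(origin, destination)] += 1
    return flows


# Hypothetical cluster centroids and trajectories in planar units
sample_centroids = [(0, 0), (1000, 0), (1000, 1000)]
sample_trajectories = [
    [(10, 5), (990, 10), (1005, 980)],
    [(-5, 0), (20, 10), (1010, -10)],
    [(1000, 990), (995, 5)],
]
flow_counts = count_flows(sample_trajectories, sample_centroids)
print(flow_counts)
```

Flows are directed in this sketch, so a move from cluster 1 to cluster 2 is counted separately from a move from cluster 2 to cluster 1.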